A sampling method, called the chain-ratio method, is applied to estimating the
distribution of mail by destination. Variances and coefficients of variation for the
estimators are given, together with the details and results of three applications of
the method to outgoing first-class letter mail. These studies were conducted by the
National Bureau of Standards in San Francisco, Los Angeles, and Baltimore.



1. Introduction 

The National Bureau of Standards has been active
in developing equipment and systems for improved
letter sorting by automation. To develop design
parameters it is necessary to determine the physical
characteristics of mail and the proportion of mail
going to various destinations. 2 Since the volume of
mail is much too large for complete piece counts to be
feasible, sampling methods of known and adequate
accuracy must be used. The present paper is the
first step by NBS in the effort to develop such
methods as applied to mail distribution. Studies
and results concerning letter-size characteristics are
reported by Severo, Newman, Young, and Zelen in
[1] 3 and a general background to the mechanization
program is given by I. Rotkin [2].

This paper discusses a sampling procedure designed 
to estimate the proportion of mail going to each 
destination. The sampling plan used in this study is 
referred to as the "chain-ratio" method because the 
nature of the formulas involved in the analyses 
resembles a chain of ratios. The method has been 
applied to outgoing first class letter-mail at the San 
Francisco, Los Angeles, and Baltimore Post Offices. 

It was intended, initially, to study five cities: 
Baltimore, Washington, Philadelphia, Chicago, and 
Los Angeles. Philadelphia, Baltimore, and Wash- 
ington were chosen because they would tend to give 
a pattern of postal operations on the East Coast. 
Chicago was chosen to show Midwest influence, and 
Los Angeles was selected to show the West Coast 
influence. San Francisco was added to the list in 
an effort to find out whether Los Angeles was 
atypical, because Los Angeles serves an unusually 
large area. 

The Post Office Department made special studies 
in Philadelphia, Chicago, and New York, where in 
each case a complete count was made of the total 
volume of mail to each destination for either a 24- 
or 48-hour period of time. Actually this complete 



1 Present address: University of Buffalo. 

2 Italicized terms have special meanings in this study and are defined in section
2.1 of this paper or in the Postal Term Glossary, U.S. Post Office Department,
August 1956, P.O.D. Publication 18.

3 Figures in brackets indicate the literature references at the end of this paper. 



count was obtained by footage measurements of
stacks of mail and a conversion factor of 290 letters
per foot of mail was used. The NBS also made a
modified version of the complete count on November
5, 1956, in Baltimore. In this count, only the total
volume entering the system between 4 P.M. and
7 P.M. was included.

However, any complete count of large volumes of 
mail, even for short periods of time such as 3 hours, 
involves a considerable number of man hours and 
invariably tends to delay the normal function of 
sorting mail. Furthermore, any such complete 
counts are open to criticisms that may be leveled 
against complete enumeration methods. (The
literature contains many examples [3, 4, 5, 6] comparing
complete enumeration methods with statistically
designed sampling procedures, and shows the desirability,
from the standpoint of economics and reliability,
of the sampling techniques.) A complete
count of mail, properly done, say, for 24 hours,
gives a good indication of what happens during a
particular 1/365 part of a year. If one wishes to
enlarge this fraction then additional complete
counts can be made. Thus to represent a particular
5/365 part of a year one might take 5 consecutive
days — e.g., Monday through Friday or Thursday 
through Monday depending upon whether or not 
the weekend is to be included. This is expensive 
and time consuming. Furthermore tremendous 
effort is needed on the part of all concerned to keep 
track of all the mail to each destination. Thus 
errors are bound to occur. Finally, the mail itself 
will tend to be delayed during such exhaustive 
counts. A sampling study, on the other hand,
enables one to check the flow pattern of mail from
time to time during any interval of time and with
far less effort and disruption to routine operations
than in a complete enumeration, and hence may
more accurately represent normal operations. Thus,
for example, to obtain information about mail for 
some given week, samples may be taken for short 
intervals several times each day throughout the 
week. (Actually in the application discussed here, 
two samples a day were taken during a 5-day period 
excluding the weekends.) Or if one wanted to check 
the behavior of mail for any other given time period,
say some particular month or during the Christmas 
rush, then samples could be taken from time to time 
during that particular time period. 

The destination data obtained by application of 
the chain-ratio method has been used as basic input 
for: (1) Simulation studies of the effectiveness of 
an NBS proposed sorting machine; (2) studies of 
comparative costs for various types of mechanized 
letter sorting systems, including the one embodied 
in the machine mentioned in (1); (3) analytic com- 
parisons of suggested configurations for automatic 
mail sorting equipment [7] ; and, (4) improvement of 
current sorting procedures. 

Only some typical results of the San Francisco 
study are presented here. The reader is referred to 
[8] for detailed results of the San Francisco, Los 
Angeles, and Baltimore studies. 

Section 2 gives the definitions as used in this 
paper and the model of the flow of mail that is 
studied. Section 3 presents in detail the sampling 
procedures, analysis, and the volume counts used 
for the particular applications discussed. Section 4 
defines precisely the types of mail that were studied 
at San Francisco, Los Angeles, and Baltimore. 
Section 5 presents the details of the San Francisco 
study. 

2. Definitions and the Model 

2.1. Definitions 

A list of definitions of terms, as used in this paper 
is given here for reference. These definitions are 
given in order to avoid misinterpretation and am- 
biguity because of postal language differences 
between post offices. 

1. SEPARATION: a classification characterized by a labeled pigeonhole on a
   sorting case.

2. DESTINATION: a final separation made at a given post office. All directs
   and residues are included in this classification. 4

3. DIRECT: a destination to a single given post office.

4. DISTRIBUTION: the function of physically sorting letters into their
   respective separation boxes.

5. PRIMARY: the first stage of distribution of outgoing mail.

6. SECONDARY: the second stage of distribution of outgoing mail.

7. TERTIARY: the third stage of distribution of outgoing mail.

8. BYPASS: mail which receives its first distribution in the secondary and
   tertiary cases. Also mail which goes directly to the city section.

9. RESIDUE: mail destined for post offices for which no direct separation is
   provided in a case or rack.

10. TOTAL VOLUME: the defined classes of mail studied. (Total volume is
    defined more explicitly as used in this study in section 4.)

The expression "off the primary, secondary, or 
tertiary" indicates mail which has just undergone 
that stage of distribution. 



4 Air mail and foreign mail off the primary are also considered destinations in 
this study. 



2.2. The Model 

The model for the operation of outgoing mail 
consists of a three stage sorting scheme which can 
be represented by a flow chart as given in figure 
1. The total volume in the top box consists of those 
types of mail indicated in section 4. This volume 
then divides into two parts, that which goes to the 
primary and that which bypasses the primary. The 
bypass mail is sent either to the city section or to 
the secondary. Mail leaving the primary may go 
either to its destinations or to the secondary. The 
secondary consists of sections which can be numbered 
1, 2, 3, . . . and which correspond to primary separa- 
tions needing further distribution. We call the i-th 
section the "i-th secondary." From any section in
the secondary, mail can go either to its destination
or to one of the tertiary sections which can be numbered
1, 2, 3, . . . Therefore sections in the tertiary,
corresponding to separations from the i-th secondary,
can be numbered i1, i2, i3, . . . The ij-th section is
called the "ij-th tertiary." Mail leaving the tertiary
goes directly to its destinations. A more detailed
description of how a letter flows through this system 
is given in [9]. 

Since the model for incoming mail is similar to 
that for outgoing mail, the procedures discussed 
below may also be applied in studies of incoming 
mail. 
Figure 1. Flow chart model for the distribution of outgoing mail.
3. Chain-Ratio Estimates

In this section we discuss the estimation formulas 
and their associated variances and coefficients of 
variation for estimating the proportion of mail to a 
given destination. We also present a list of notations 
and specific formulas used in the applications given 
in section 5 and in [8]. 

3.1. The General Method 

The basic idea involved in the estimation formulas 
consists of multiplying together a chain of ratios. 
Two conditions are required for setting up the chain. 
The first is that each ratio must be one that can be 
estimated conveniently. This can often be done by 
using volume count data customarily recorded by the 
particular post office. If such records are not kept 
then it must be possible either to arrange that they 
be kept or to devise appropriate sampling plans that 
would provide estimates of each ratio. It is essential
that such plans be simple to implement and not
interrupt the flow of the mail.

The second requirement is that the ratios must be
linked together in chain form so that the desired ratio
is all that remains after "canceling." This is similar
to the usual chain differentiation carried out in the
calculus. There if it is desired to obtain $\partial f/\partial x$, where
$f = f[z[y(x)]]$, then one writes

$$\frac{\partial f}{\partial z}\,\frac{\partial z}{\partial y}\,\frac{\partial y}{\partial x}$$

and a "cancellation" check gives the desired result;
i.e.,

$$\frac{\partial f}{\partial z}\,\frac{\partial z}{\partial y}\,\frac{\partial y}{\partial x} = \frac{\partial f}{\partial x}.$$

Such a "cancellation" is, of course, only a convenient
artifice. It must be proved that the multiplication of
such a chain of derivatives actually does yield the
desired derivative $\partial f/\partial x$.

Here we have a similar situation. Suppose we are
interested in estimating the ratio of mail to a primary
destination to the total volume. Let us denote this
ratio by the parameter $D_p/T$. Suppose that (1) the
particular post office under study keeps records which
enable us to obtain an unbiased estimate of the ratio
of primary mail to the total volume, call this estimate
$(T_p/T)$, and (2) it is possible to set up a simple sampling
plan which yields an unbiased estimate of the ratio
of mail to the primary destination to the primary
volume, call this estimate $(D_p/T_p)$. Then we write
$(D_p/T_p)(T_p/T)$, cancel, as in the calculus, and obtain
the desired ratio estimate $(D_p/T)$. That such
"cancellation" is permitted is seen as a special case of the
following:

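Let $R_1, R_2, \ldots, R_K$ be a set of $K$ statistically
independent random variables such that

$$E(R_i) = \frac{r_{i-1}}{r_i}, \qquad i = 1, 2, \ldots, K.$$

Then for any $j \le K$

$$E(R_1 R_2 \cdots R_j) = E(R_1)\,E(R_2) \cdots E(R_j)
= \frac{r_0}{r_1} \times \frac{r_1}{r_2} \times \cdots \times \frac{r_{j-1}}{r_j}
= \frac{r_0}{r_j}.$$

Thus $R_1 R_2 \cdots R_j$ is an unbiased estimate of the
ratio $r_0/r_j$.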
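
The unbiasedness of a product of independent ratio estimates can be checked exactly on a small case. The sketch below is only an illustration: it assumes each $R_i$ is a binomial proportion $X_i/n_i$, and the sample sizes and proportions used are hypothetical, not figures from the study.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Exact binomial probabilities P(X = x) for x = 0..n, with p a Fraction."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]


def expected_product(n1, p1, n2, p2):
    """Exact E[(X1/n1) * (X2/n2)] for two independent binomial counts."""
    pmf1 = binomial_pmf(n1, p1)
    pmf2 = binomial_pmf(n2, p2)
    return sum(
        w1 * w2 * Fraction(x1, n1) * Fraction(x2, n2)
        for x1, w1 in enumerate(pmf1)
        for x2, w2 in enumerate(pmf2)
    )


# Hypothetical chain: r0/r1 = 3/10 and r1/r2 = 1/4, so r0/r2 = 3/40.
chain_expectation = expected_product(5, Fraction(3, 10), 4, Fraction(1, 4))
print(chain_expectation)
```

Because the arithmetic is done with exact fractions, the printed expectation can be compared directly with the telescoped ratio $r_0/r_2$.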

It is important to point out that the chain of 
ratios used for a particular application should be 
devised with that application in mind. By so doing, 
optimum use may be made of records already being 
kept by that post office. No single set of formulas 
can be used easily in every case because not all post 
offices maintain the same volume count records. 

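The variance of $R_1 R_2 \cdots R_K$, which we symbolize
by $\sigma^2_{R_1 \cdots R_K}$, may be easily obtained. Denote
the mean and variance of $R_i$ by $m_i$ and $\sigma_i^2$, respectively.
Then, for $K = 2$,

$$\sigma^2_{R_1 R_2} = E(R_1^2 R_2^2) - [E(R_1 R_2)]^2
= \sigma_1^2\sigma_2^2 + \sigma_1^2 m_2^2 + m_1^2\sigma_2^2.$$

Similarly, for $K = 3$ and 4 we obtain

$$\begin{aligned}
\sigma^2_{R_1 R_2 R_3} ={}& \sigma_1^2\sigma_2^2\sigma_3^2 + m_1^2\sigma_2^2\sigma_3^2 + \sigma_1^2 m_2^2\sigma_3^2 + \sigma_1^2\sigma_2^2 m_3^2 \\
&+ m_1^2 m_2^2\sigma_3^2 + m_1^2\sigma_2^2 m_3^2 + \sigma_1^2 m_2^2 m_3^2
\end{aligned}$$

and

$$\begin{aligned}
\sigma^2_{R_1 R_2 R_3 R_4} ={}& \sigma_1^2\sigma_2^2\sigma_3^2\sigma_4^2 \\
&+ m_1^2\sigma_2^2\sigma_3^2\sigma_4^2 + \sigma_1^2 m_2^2\sigma_3^2\sigma_4^2 + \sigma_1^2\sigma_2^2 m_3^2\sigma_4^2 + \sigma_1^2\sigma_2^2\sigma_3^2 m_4^2 \\
&+ m_1^2 m_2^2\sigma_3^2\sigma_4^2 + m_1^2\sigma_2^2 m_3^2\sigma_4^2 + m_1^2\sigma_2^2\sigma_3^2 m_4^2 \\
&+ \sigma_1^2 m_2^2 m_3^2\sigma_4^2 + \sigma_1^2 m_2^2\sigma_3^2 m_4^2 + \sigma_1^2\sigma_2^2 m_3^2 m_4^2 \\
&+ m_1^2 m_2^2 m_3^2\sigma_4^2 + m_1^2 m_2^2\sigma_3^2 m_4^2 + m_1^2\sigma_2^2 m_3^2 m_4^2 + \sigma_1^2 m_2^2 m_3^2 m_4^2.
\end{aligned}$$

In general

$$\sigma^2_{R_1 \cdots R_K} = \sum \prod_{i=1}^{K} t_i - \prod_{i=1}^{K} m_i^2 \tag{1}$$

where the summation is over all possible $2^K$ combinations
obtained by letting each $t_i$ take on either the
value $m_i^2$ or $\sigma_i^2$.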
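
The general variance expression can be evaluated either as a sum over the $2^K$ choices of $t_i$ or in the equivalent closed form $\prod(\sigma_i^2 + m_i^2) - \prod m_i^2$. The following is a minimal sketch of both, assuming independent ratio estimates; the function names and the numerical inputs are illustrative and do not come from the study.

```python
from itertools import product
from math import prod


def product_variance(means, variances):
    """Variance of a product of independent variables (closed form)."""
    second_moment = prod(v + m * m for m, v in zip(means, variances))
    squared_mean = prod(m * m for m in means)
    return second_moment - squared_mean


def product_variance_expanded(means, variances):
    """Same variance, written as the sum over all 2**K choices of t_i."""
    # Each t_i is either m_i**2 or sigma_i**2.
    choices = [(m * m, v) for m, v in zip(means, variances)]
    total = sum(prod(combo) for combo in product(*choices))
    # The all-means term is E(product)**2 and must be removed.
    return total - prod(m * m for m in means)


# Hypothetical means and variances for a two-link chain.
means = [0.5, 0.2]
variances = [0.01, 0.04]
print(product_variance(means, variances))
print(product_variance_expanded(means, variances))
```

The closed form avoids enumerating $2^K$ terms, which matters little for chains of three or four ratios but keeps the computation simple.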

Let $k_i$ denote the coefficient of variation of $R_i$
(i.e., $k_i = \sigma_i/m_i$) and let $k_{R_1 \cdots R_K}$ denote the coefficient
of variation of $R_1 R_2 \cdots R_K$. Then it follows from
eq (1) that

$$k^2_{R_1 \cdots R_K} = \sum \prod_{i=1}^{K} t_i - 1 \tag{2}$$

where now the summation is over all possible $2^K$
combinations obtained by letting each $t_i$ take on
either the value $k_i^2$ or 1. Thus for the case $K = 2$,

$$k^2_{R_1 R_2} = k_1^2 k_2^2 + k_1^2 + k_2^2.$$

For $k_1$ and $k_2$ small, we obtain by neglecting terms of
higher order

$$k^2_{R_1 R_2} \approx k_1^2 + k_2^2 \le 2 \max(k_1^2, k_2^2).$$

Therefore, for $k_1$ and $k_2$ sufficiently small

$$k_{R_1 R_2} \le \sqrt{2}\, \max(k_1, k_2).$$

In a similar way it is easy to show that

$$k_{R_1 \cdots R_K} \le \sqrt{K}\, \max(k_1, k_2, \ldots, k_K) \tag{3}$$

for the $k_i$ sufficiently small. This says, essentially,
what we would intuitively expect; namely that the
coefficient of variation of the chain is bounded by a
multiple of the coefficient of variation of the "weakest
link." This weakest link is that ratio which has
the greatest percent variability.

The ratio estimates $R_i$ used in the applications in
later sections are of the form $X/n$ where $X$ is either a
binomial or a multinomial random variable and $n$ is
the sample size. Such being the case, the coefficient
of variation of any ratio estimate is $k = \sqrt{(1-p)/(np)}$,
where $p$ is the expected value of $X/n$. Since the
sample sizes used here are large, the value $k$ is indeed
small. For example if $p = 0.05$ and $n = 5{,}000$ then
the relative standard error of $X/n$ is $k = 0.0616$ or
6.2 percent and the absolute standard error is
$0.062 \times 0.05 = 0.0031$. Thus the overall uncertainty is of
the order of $3 \times 0.0031 = 0.0093$, so that in repeated
drawings of samples of 5,000 we would expect almost all
of the estimates $X/n$ to lie within $0.05 \pm 0.0093$.
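
The arithmetic of this example, and the "weakest link" bound of eq (3), can be reproduced with a short sketch. It assumes binomial proportions and independent links; the function names and the second coefficient of variation (0.02) are hypothetical. Note that eq (3) holds only approximately, after higher-order terms are neglected, so the exact chain value can slightly exceed the bound when all links are equally variable.

```python
from math import prod, sqrt


def proportion_cv(p, n):
    """Coefficient of variation of X/n for a binomial count X."""
    return sqrt((1.0 - p) / (n * p))


def chain_cv(cvs):
    """Exact coefficient of variation of a product of independent ratios."""
    return sqrt(prod(1.0 + k * k for k in cvs) - 1.0)


def weakest_link_bound(cvs):
    """Approximate bound: sqrt(K) times the largest link CV."""
    return sqrt(len(cvs)) * max(cvs)


k = proportion_cv(0.05, 5000)      # relative standard error of X/n
abs_se = k * 0.05                  # absolute standard error
print(round(k, 4), round(abs_se, 4), round(3 * abs_se, 4))

links = [k, 0.02]                  # second link is a hypothetical value
print(chain_cv(links), weakest_link_bound(links))
```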

3.2. Notations and Formulas Used in the Applications 

In the preceding section we presented the general 
method for setting up a chain-ratio estimate for the 
percentage of mail to a given destination. In this 
section we give the specific chain -ratio formulas 
used in the San Francisco, Los Angeles, and Balti- 
more studies. We list the notations of the ratios 
involved, and the related formulas for determining 
the percentage of mail to a destination off the Primary, 
Secondary, and Tertiary stages. 

a. Notations 

In the discussion of the general method in section
3.1, ratios appear with and without parentheses.
In the list of notations that follows, all ratios appear
within parentheses. Throughout this paper we shall
always denote parameters by ratios without parentheses,
and unbiased estimates of these parameters
will always be denoted by ratios within parentheses.

$(D_p/T_p)$ = Ratio of mail to a primary destination to the primary volume.
Obtained from primary samples.

$(S_i'/T_p)$ = Ratio of mail to an i-th secondary to the total primary volume.
Obtained from primary samples.

$(\sum S_i'/T_p)$ = Sum of ratios of mail to all secondaries to total primary
volume. Obtained from primary samples.

$(S_i/S_i')$ = Ratio of mail to an i-th secondary including bypass mail to the
i-th secondary excluding bypass mail. Obtained from volume counts.

$(D_{s_i}/S_i)$ = Ratio of mail to an i-th secondary destination to the i-th
secondary. Obtained from i-th secondary samples.

$(t_{ij}/S_i)$ = Ratio of mail to a j-th tertiary (off i-th secondary) to i-th
secondary. Obtained from i-th secondary samples.

$(D_{t_{ij}}/t_{ij})$ = Ratio of mail to a j-th tertiary destination (off i-th
secondary) to the j-th tertiary. Obtained from the ij-th tertiary samples.

$(D_p/T)$ = Ratio of mail to a primary destination to the total volume.
Obtained from chain-ratio formula.

$(D_{s_i}/T)$ = Ratio of mail to an i-th secondary destination to the total
volume. Obtained from chain-ratio formula.

$(D_{t_{ij}}/T)$ = Ratio of mail to a j-th tertiary destination (off i-th
secondary) to the total volume. Obtained from chain-ratio formula.

$(T_p/T)$ = Ratio of primary mail to the total volume. Obtained from volume
counts.

$(B_s/T)$ = Ratio of bypass mail entering at the secondary to the total volume.
Obtained from volume counts.

$(T_s/T)$ = Ratio of total secondary mail to total volume. Obtained from
volume counts.

$(\sum D_p/T)$ = Sum of ratios of mail to all destinations off the primary to
the total volume. Obtained from volume counts.

$(S_i/T_s)$ = Ratio of mail to an i-th secondary to the total secondary volume.
Obtained from secondary volume counts.

$(D_p/\sum D_p)$ = Ratio of mail to a destination off the primary to the sum of
all destinations off the primary. Obtained from the primary samples.



b. Related Formulas 

Two essentially different sets of formulas were 
used. The choice between the two depended upon 
whether or not the percentage of secondary mail that 
entered the system at each specific secondary case 
was readily available. This often entailed setting
up special and difficult procedures for obtaining this 
ratio. In Baltimore we made special volume counts. 
In all cases the aim was to estimate the ratio of mail 
going to a given destination to the sum of primary 
and all bypass mail. 

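(1) For Baltimore, where the percentage of bypass
mail entering the system at the secondary was large,
the following formulas were used:

a. For a destination off the primary:

$$\left(\frac{D_p}{T}\right) = \left(\frac{D_p}{\sum D_p}\right) \times \left(\frac{\sum D_p}{T}\right) \tag{4}$$

b. For a destination off the secondary:

$$\left(\frac{D_{s_i}}{T}\right) = \left(\frac{D_{s_i}}{S_i}\right) \times \left(\frac{S_i}{T_s}\right) \times \left(\frac{T_s}{T}\right) \tag{5}$$

c. For a destination off the tertiary:

$$\left(\frac{D_{t_{ij}}}{T}\right) = \left(\frac{D_{t_{ij}}}{t_{ij}}\right) \times \left(\frac{t_{ij}}{S_i}\right) \times \left(\frac{S_i}{T_s}\right) \times \left(\frac{T_s}{T}\right) \tag{6}$$

It is to be noted that formulas (5) and (6) of this
section depend upon special volume count data that
give $(S_i/T_s)$.

For examples worked out in detail, see the San
Francisco study, section 5.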

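(2) For both San Francisco and Los Angeles,
where the percentage of bypass mail entering the
system at the secondary was very small, no special
volume counts of mail into the secondary were made.
Instead, the following formulas were used:

a. For a destination off the primary:

$$\left(\frac{D_p}{T}\right) = \left(\frac{D_p}{T_p}\right) \times \left(\frac{T_p}{T}\right) \tag{7}$$

b. For a destination off the secondary:

$$\left(\frac{D_{s_i}}{T}\right) = \left(\frac{D_{s_i}}{S_i}\right) \times \left(\frac{S_i'}{T_p}\right) \times \left(\frac{T_p}{T}\right) \times
\frac{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right) + \left(\frac{B_s}{T}\right)}{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right)} \tag{8}$$

c. For a destination off the tertiary:

$$\left(\frac{D_{t_{ij}}}{T}\right) = \left(\frac{D_{t_{ij}}}{t_{ij}}\right) \times \left(\frac{t_{ij}}{S_i}\right) \times \left(\frac{S_i'}{T_p}\right) \times \left(\frac{T_p}{T}\right) \times
\frac{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right) + \left(\frac{B_s}{T}\right)}{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right)} \tag{9}$$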

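The justification for eqs (8) and (9) is the following:
Set up the chain

$$\frac{D_{s_i}}{T} = \frac{D_{s_i}}{S_i} \times \frac{S_i}{S_i'} \times \frac{S_i'}{T_p} \times \frac{T_p}{T}.$$

Note that $S_i/S_i'$ involves obtaining an estimate of
the ratio of mail to an i-th secondary including
bypass mail to the i-th secondary excluding bypass
mail. As mentioned above this was difficult to
accomplish in practice. However if $S_i/S_i' = S_j/S_j'$ for
all $i$ and $j$, then

$$\frac{S_i}{S_i'} = \frac{\sum S_j}{\sum S_j'}.$$

The quantity $\sum S_i / \sum S_i'$ can be written as

$$\frac{\dfrac{T_p}{T} \sum \dfrac{S_i'}{T_p} + \dfrac{B_s}{T}}{\dfrac{T_p}{T} \sum \dfrac{S_i'}{T_p}}. \tag{10}$$

Using the "propagation of error" formula, we obtain
an estimate of (10) as

$$\frac{\left(\dfrac{T_p}{T}\right)\left(\dfrac{\sum S_i'}{T_p}\right) + \left(\dfrac{B_s}{T}\right)}{\left(\dfrac{T_p}{T}\right)\left(\dfrac{\sum S_i'}{T_p}\right)}. \tag{11}$$

Each of the estimates involved in (11) could be
easily obtained. Thus we have eq (8). Justification
for eq (9) follows similarly.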

If the assumption $S_i/S_i' = S_j/S_j'$, for all $i$ and $j$, is
not true then eqs (8) and (9) still apply approximately
provided the ratio $\sum S_i / \sum S_i'$ is close to one.
We confined the use of these formulas to those
applications where the ratio of bypass mail to the
secondary to the total secondary mail was small
(San Francisco, 0.8 percent; Los Angeles, 2.5
percent).
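
The bypass correction discussed above (secondary mail including bypass divided by secondary mail excluding bypass, expressed through $T_p/T$, $\sum S_i'/T_p$, and $B_s/T$) is simple to compute. The sketch below assumes that reading of the correction; the function names are hypothetical, and the values 0.37 and 0.025 are purely illustrative, not figures from the study.

```python
def bypass_correction(tp_over_t, sum_si_prime_over_tp, bs_over_t):
    """Secondary mail including bypass divided by secondary mail from the primary."""
    from_primary = tp_over_t * sum_si_prime_over_tp
    return (from_primary + bs_over_t) / from_primary


def secondary_share(si_prime_over_tp, tp_over_t, correction):
    """Estimated share of the total volume handled by one secondary."""
    return si_prime_over_tp * tp_over_t * correction


# Illustrative inputs: primary share 0.8674, secondary bypass share 0.0026,
# and hypothetical sample ratios 0.37 (all secondaries) and 0.025 (one secondary).
correction = bypass_correction(0.8674, 0.37, 0.0026)
share = secondary_share(0.025, 0.8674, correction)
print(correction, share)
```

When the bypass share is a fraction of a percent, the correction stays very close to one, which is why the approximation is tolerable for offices with little secondary bypass mail.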

3.3. Methods of Collecting Data 

a. Volume Data 

Certain ratios needed to be established in order 
to relate the pieces of mail counted in each separation 
of the sample to the total volume of mail. It was 
therefore necessary to acquire from volume counts 
in the post office the following data. 

Daily volume information expressed in footage for:
a. all mail into the primary;
b. all mail bypassing the primary and entering the secondary;
c. all bypass mail to the city;
d. all mail into each individual type secondary case. (This count may not
be necessary, see section 3.2(b).)

Items a, b, and c above are normally maintained 
daily by the post office. Item d usually involves 
special volume counts. From the data listed above 
it is possible to determine the ratio of each class and 
type processed to the total volume of mail. Several 
of these ratios are then utilized in the formulas of 
section 3.2(b) to estimate the percentage of the total 
volume going to each destination. These volume 
figures were obtained at least 1 day prior to drawing 
the sample so that decisions regarding the type of 
analysis to be used could be made early. Very 
often the analysis did not make use of certain volume 
ratios, such as those of d above, and therefore the 
particular volume counts could be discontinued. (See
section 5.1 for example.)

b. Sample Data 

(1) Primary. Two feet of mail was selected as it 
flowed into the primary cases from the canceling 
machines. It was placed on the ledge of the "test" 
case and distributed by a clerk. Special care was 
taken to make sure that no mail was added to or 
subtracted from the sample. After distribution had 
been made, the contents of each separation box were 
counted by the distributor and recorded by the 
supervising clerk (e.g., see fig. 3). 

Special care was given to the choice of the sample. 
The randomness of the selection of the 2-ft tray was 
assured by choosing the first 2 feet flowing into the 
primary from the canceling machines at the prede- 
termined time for drawing the sample. The mail 
accumulating in the stackers of a cancellation ma- 
chine is fed from a moving conveyor belt that passes 
7 or 8 persons, each of whom faces and places on the 
belt letters selected from those within his reach. 
Thus the letters undergo a fairly thorough mixing as 
they are being stacked so that the letters in any 
tray of mail sampled at this point would tend to 
have the property of randomness which is necessary 
in sampling studies. This method of sampling was 
selected in order to help eliminate the possibility of 
personal bias, conscious or unconscious, or personal 
responsibility for actual allocations. 



However, metered mail and patron segregated mail, 
which does not undergo this mixing process at the 
facing table, was sampled differently. Any "bite" 
or "bunch" of this kind of mail may be addressed to 
the same destination and therefore would not have 
the required property of randomness. In this case 
successive letters were selected every few inches 
apart from each tier of mail until the required 2 feet 
was obtained. The distance between successive 
letters was predetermined and constant. 
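
The fixed-interval selection described for metered and patron-segregated mail is a form of systematic sampling. A minimal sketch follows, assuming each tier is represented as an ordered list of letters and the spacing is expressed as a count of letters rather than inches; both the representation and the function name are hypothetical.

```python
def draw_from_tiers(tiers, step, target):
    """Take every `step`-th letter from each tier until `target` letters are drawn."""
    sample = []
    for tier in tiers:
        # A constant, predetermined spacing is used within every tier.
        for letter in tier[::step]:
            if len(sample) == target:
                return sample
            sample.append(letter)
    return sample


# Two hypothetical tiers of 100 letters each, identified by index.
tiers = [list(range(100)), list(range(100, 200))]
sample = draw_from_tiers(tiers, step=10, target=15)
print(sample)
```

Spacing the selections apart keeps any single "bunch" addressed to one destination from dominating the sample.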

Two samples, each of which consists of about 580 
letters, were drawn during the morning peak period 
and 2 during the evening peak period. Samples 
were taken for 5 successive days, exclusive of Satur- 
day and Sunday, in order to obtain a fairly repre- 
sentative picture of the mail throughout the sampling 
period. 

(2) Secondary. Mail flowing into the secondary 
comes either from the primary or from bypass mail. 
Secondary cases do not continuously generate enough 
mail to be sampled at any given moment. Each 
sample was drawn when enough mail was generated. 
In each case the sample used in the study was the 
first 2 feet of mail that accumulated after a case 
had been selected for sampling. After distribution 
had been made, the contents of each separation box
were counted by the distributor and recorded by the
supervising clerk. One sample was taken in the
morning peak and one in the evening peak period
throughout the week.

(3) Tertiary. Mail flowing into the tertiary cases 
usually comes from the secondary. Therefore, it 
was possible to make counts on these cases only 
when enough mail was generated. 

However, where the required 2 ft did not
accumulate, smaller samples (i.e., whatever was
available) were counted. Here again, after distribution
had been made, the contents of each separation
box were counted by the distributor and recorded 
by the supervising clerk. Samples were taken once 
in the morning and once in the evening at peak 
periods throughout the week. 

In order to satisfy the condition of statistical 
independence, we avoided, as much as possible, 
having the same letters represented in samples from 
more than one stage. Care was taken to record 
any mail dispatched during the sample period prior 
to the final count of each destination on primary, 
secondary, and tertiary cases. Thus missing obser- 
vations were avoided. 

4. Type of Mail Studied at San Francisco, 
Los Angeles, and Baltimore 

The total volume of mail studied in the San Fran- 
cisco, Los Angeles, and Baltimore Post Offices may 
be classified as outgoing first class letter mail of the 
following types : 

1. Cancellation mail (machine and hand).

a. Stamped mail to primary

b. Air mail to primary

c. Specials to primary

d. Stamped mail to secondary bypassing primary

e. Stamped bypass mail to city.

2. Noncancellation mail

a. Metered to primary

b. Metered to secondary bypassing primary

c. Air mail to primary

d. Specials to primary

e. Permit to primary

f. Permit to secondary bypassing primary

g. Penalty to primary

h. Metered and permit bypass to city.

3. Transit mail 5

a. Transit to secondary

b. Transit to city.

Not included in this study is any type of incoming
letter mail nor outgoing first class letter mail of the
following types:

1. All mail to air mail and special delivery sections
bypassing primary.

2. Transit mail receiving no distribution.

3. Large special mailings which would tend to
bias the sample.

5. San Francisco Study 

In this section we present a rather detailed descrip- 
tion of the application of the chain-ratio method in 
the study conducted in San Francisco. 

5.1. Volume Count Data 

Volume counts made in San Francisco enabled us
to determine what percentage of the total volume
flowed into the primary, and how much bypassed the
primary and flowed either into the city section for
local distribution or into the secondary. These counts
were made on 6 days, June 21, 24, 25, 26, 27, and 28,
1957, between the hours of 10 a.m. and 10 p.m.
Control counts were begun one day prior to drawing
samples, so that decisions regarding sample size and
optimum sampling periods and areas could be made.
Volume control counts showed that mail flowing 
into the secondary that bypassed the primary was 
less than 1 percent. Thus San Francisco was 
analyzed according to part 2 of section 3.2(b). 
Therefore, it was established early that a footage 
count of mail flowing into the secondary could be 
discontinued. 

Percentages corresponding to the total volume 
figures are summarized in table 1. The flow chart 
given in figure 2 contains the basic proportion figures 
which are then applied in the appropriate formula, as 
well as certain other summary figures that are a 
result of the sampling study. 

5.2. Sampling Procedure 

The sampling procedure adopted for San Francisco 
is the same as that described in section 3.3(b) with 
the modification that, wherever possible, the samples 
were made to consist of equal parts of the following: 



5 Mail received from another post office for outgoing processing.



Stamped long; stamped short; metered long; and 
metered short letters. This was done because San 
Francisco makes a separation between long and 
short letters which is maintained throughout the 
primary and secondary cases but not, however, in the 
tertiary cases. Furthermore, metered and non- 
metered mail are worked separately throughout the 
primary and secondary cases. Moreover, the volumes
of the different classifications were roughly equal.
The volume of mail generated in the tertiary cases 
was very small during the morning sampling period. 
Therefore, no tertiary samples were taken during 
this period. 

Figure 3 shows copies of sample field data for the 
primary, a typical secondary, and a typical tertiary 
at the San Francisco post office. Each column repre- 
sents samples taken on each of the 5 consecutive 
sampling days. Application of the formulas to an 
example from each stage is shown in section 5.4. 



Table 1. Percentages obtained from volume count data
supplied by the San Francisco Post Office during the test
period

Date          Primary    City bypass    Secondary bypass
                 %            %                %
6-21-57        84.13        15.87             0.00
24             89.44        10.42             0.14
25             89.59         9.97              .46
26             85.66        14.03              .31
27             85.55        14.04              .41
28             86.34        13.40              .26
Average %      86.74        13.00              .26



Figure 2. San Francisco flow chart.
(Boxes: total volume, mailing primary, destinations, city bypass. The legend
distinguishes quantities obtained from samples from those obtained from
volume counts.)
5.3. Computational Formulas

In this section the computational formulas used
to estimate the percentage of the total volume of
mail going to any given destination are given. As
indicated above, eqs (7), (8), and (9) are appropriate
to the San Francisco study.

a. Primary

From figure 2 the value of $(T_p/T) = 0.8674$ and
therefore the appropriate formula becomes:

$$\left(\frac{D_p}{T}\right) = \left(\frac{D_p}{T_p}\right) \times 0.8674.$$

(The total number of letters in the samples off the
primary was 11,196.)

b. Secondary 

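The computational formula for destinations off
the secondary depends upon the ratios obtained at
the primary as well as the volume counts. Using
such ratios gives the formula:

$$\left(\frac{D_{s_i}}{T}\right) = \left(\frac{D_{s_i}}{S_i}\right) \times
\left[\left(\frac{S_i'}{T_p}\right) \times \left(\frac{T_p}{T}\right) \times
\frac{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right) + \left(\frac{B_s}{T}\right)}{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right)}\right]
= \left(\frac{D_{s_i}}{S_i}\right) \times c_i$$

where the $c_i$ are the quantities in brackets which
depend upon the particular secondary. Values of
$c_i$ corresponding to particular secondaries are listed
in table 2.

These constants actually represent the ratio, as
estimated by using volume and primary sample
counts, of a secondary volume of mail to the total
volume.

Figure 3. Partial view of sample data for three San Francisco
cases (worksheets).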
Table 2. Number of pieces in sample and constants used in
computational formula for destinations off the secondaries for
San Francisco

 i    S_i                                  Number of pieces      c_i
 1    Ariz.-N. Mex.-Tex.                        5,519          0.01290
 2    Ill.-Ind.-Iowa-Mass.-Mich.-Minn.          5,739           .01774
 3    Southern States                           5,865           .01468
 4    Rocky Mountain States                     5,252           .02266
 5    N.Y.-N.J.-Ohio-Pa.                        6,286           .02289
 6    Canada-Eastern                            5,535           .01797
 7    Calif. A-B                                4,676           .02180
 8    Calif. C-D                                4,945           .02367
 9    Calif. E-G                                4,499           .01351
10    Calif. H-L                                4,989           .02383
11    Calif. M-O                                4,994           .02702
12    Calif. P-R                                5,049           .03024
13    Calif. S                                  4,759           .02031
14    Calif. San Santa                          4,893           .03446
15    Calif. T-Z                                4,596           .02203
      Total                                    77,596          0.32571









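c. Tertiary

The computational formula for destinations off
the tertiary depends upon ratios obtained at the
primary and secondary, as well as the volume counts.
Using such ratios gives the formula:

$$\left(\frac{D_{t_{ij}}}{T}\right) = \left(\frac{D_{t_{ij}}}{t_{ij}}\right) \times
\left[\left(\frac{t_{ij}}{S_i}\right) \times \left(\frac{S_i'}{T_p}\right) \times \left(\frac{T_p}{T}\right) \times
\frac{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right) + \left(\frac{B_s}{T}\right)}{\left(\frac{T_p}{T}\right)\left(\frac{\sum S_i'}{T_p}\right)}\right]
= \left(\frac{D_{t_{ij}}}{t_{ij}}\right) \times k_{ij}$$

where the $k_{ij}$ are the quantities in brackets which
depend upon the particular tertiary. Values of $k_{ij}$
corresponding to particular tertiaries are listed in
table 3.

These constants actually represent the ratio, as
estimated by using volume counts and primary and
secondary sample counts, of a tertiary volume of mail
to the total volume.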



Table 3. Number of pieces in sample and constants used in
computational formula for destinations off the tertiaries for
San Francisco

  i,j        $t_{ij}$         Number of pieces    $k_{ij}$

  7,1        Calif. A-B            1,665         0.00145
  8,1        Calif. C-D            2,507         0.00277
  9,1        Calif. E-G            1,727         0.00081
 10,1        Calif. H-L            2,648         0.00229
 11,1        Calif. M-O            2,086         0.00185
 12,1        Calif. P-R            2,262         0.00135
 13+14,1     Calif. S              1,118         0.00107
 15,1        Calif. T-Z            2,152         0.00202

             Total                16,165         0.01361









5.4. Examples 

Applications of the formulas for each stage are 
given here. 

Primary: (Seattle, Wash.)

$D_p = 111$ pieces: Seattle, Wash.

$T_p = 11{,}196$ pieces: Total primary

where the numbers are taken from figure 3. Thus,

$$\left(\frac{D_p}{T}\right) = \left(\frac{D_p}{T_p}\right) \times 0.8674 = \frac{111}{11{,}196} \times 0.8674 = 0.0085996.$$

Secondary: (Bell, Calif.)

$D_s = 31$ pieces: Bell, Calif.

$S_7 = 4{,}676$ pieces: Total Calif. A-B Secondary

where the numbers are taken from figure 3. Thus,

$$\left(\frac{D_s}{T}\right) = \left(\frac{D_s}{S_7}\right) \times c_7 = \frac{31}{4{,}676} \times 0.02180 = 0.0001445$$

where the constant $c_7$ is taken from table 2.

Tertiary: (Albion, Calif.)

$D_t = 20$ pieces: Albion, Calif.

$t_{7,1} = 1{,}665$ pieces: Total Calif. A-B Tertiary

where the numbers are taken from figure 3. Thus,

$$\left(\frac{D_t}{T}\right) = \left(\frac{D_t}{t_{7,1}}\right) \times k_{7,1} = \frac{20}{1{,}665} \times 0.00145 = 0.0000174$$

where $k_{7,1}$ is taken from table 3.
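
The three worked examples share one pattern: a sample proportion at the relevant stage multiplied by that stage's constant. The following is a minimal Python sketch of that arithmetic, not the authors' own computation. The function name `chain_ratio_estimate` and its parameter names are illustrative assumptions, and the Bell count of 31 pieces is the value implied by the published result 0.0001445.

```python
def chain_ratio_estimate(destination_count, stage_sample_total, stage_constant):
    """Estimated share of total volume for one destination.

    destination_count  : pieces in the sample addressed to the destination
    stage_sample_total : all pieces sampled at that stage
    stage_constant     : 0.8674 for the primary, c_i for a secondary,
                         k_ij for a tertiary (tables 2 and 3)
    """
    if stage_sample_total <= 0:
        raise ValueError("stage_sample_total must be positive")
    return destination_count / stage_sample_total * stage_constant


# Inputs are the figures quoted in the examples of section 5.4.
seattle = chain_ratio_estimate(111, 11_196, 0.8674)   # primary
bell = chain_ratio_estimate(31, 4_676, 0.02180)       # secondary, c_7
albion = chain_ratio_estimate(20, 1_665, 0.00145)     # tertiary, k_7,1

for name, value in [("Seattle", seattle), ("Bell", bell), ("Albion", albion)]:
    print(f"{name}: {value:.7f}")
```

Multiplying each result by 100 gives the percentage of total volume in the units used in table 4.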

5.5. Tabulation of Estimated Distribution and Observations

Part of the tabulation of the estimated proportions
of the total volume of mail going to each destination is
given in table 4. Figure 4 graphically portrays the
largest 200 destinations by percentage for the Los
Angeles and Baltimore studies as well as for San
Francisco.

Figure 4. Graph of largest 200 destinations for San Francisco,
Los Angeles, and Baltimore post offices. (Curves: San Francisco,
Los Angeles, Baltimore; horizontal axis: destinations.)

Table 4. Tabulation of estimated percentages of the total volume
to each direct destination for San Francisco

Largest 200 direct destinations          Percent *    Cumulative percent

   1. San Francisco Inc. City bypass      38.501          38.501
   2. Oakland, Calif.                      8.158          46.659
   3. Los Angeles, Calif.                  2.789          49.448
   4. Sacramento, Calif.                   1.364          50.812
   5. Washington State                     1.155          51.967
   6. Berkeley, Calif.                     1.147          53.114
   7. New York City, N.Y.                  1.116          54.230
   8. San Jose, Calif.                     0.961          55.191
   9. Seattle, Wash.                       0.860          56.051
  10. Oregon State                         0.775          56.826
  11. San Mateo, Calif.                    0.759          57.585
  12. Redwood City, Calif.                 0.679          58.264
  13. Daly City, Calif.                    0.670          58.934
  14. Palo Alto, Calif.                    0.654          59.588
  15. Fresno, Calif.                       0.612          60.200
  16. Portland, Oreg.                      0.605          60.805
  17. South San Francisco                  0.574          61.379
  18. Chicago, Ill.                        0.566          61.945
  19. San Rafael, Calif.                   0.521          62.466
  20. Stockton, Calif.                     0.504          62.970
  21. Burlingame, Calif.                   0.396          63.366
  22. Menlo Park, Calif.                   0.394          63.760
  23. Santa Rosa, Calif.                   0.352          64.112
  24. San Diego, Calif.                    0.349          64.461
  25. Vallejo, Calif.                      0.295          64.756
      * * *
 196. Wilmington, Calif.                   0.030          79.907
 197. Lakeport, Calif.                     0.030          79.937
 198. Willits, Calif.                      0.029          79.966
 199. Porterville, Calif.                  0.029          79.995
 200. Placerville, Calif.                  0.029          80.024

Rank             Number in    Individual    Group      Cumulative
                 group        percent       percent    percent

 201-204              4         0.029        0.116       80.140
 205-207              3         0.028        0.084       80.224
 208-214              7         0.027        0.189       80.413
 215-220              6         0.026        0.156       80.569
 221-225              5         0.025        0.125       80.694
 226-231              6         0.024        0.144       80.838
 232-239              8         0.023        0.184       81.022
 240-249             10         0.022        0.220       81.242
 250-256              7         0.021        0.147       81.389
 257-264              8         0.020        0.160       81.549
 265-281             17         0.019        0.323       81.872
 282-292             11         0.018        0.198       82.070
 293-304             12         0.017        0.204       82.274
 305-321             17         0.016        0.272       82.546
 322-335             14         0.015        0.210       82.756
 336-360             25         0.014        0.350       83.106
 361-380             20         0.013        0.260       83.366
 381-401             21         0.012        0.252       83.618
 402-429             28         0.011        0.308       83.926
 430-467             38         0.010        0.380       84.306
 468-505             38         0.009        0.342       84.648
 506-550             45         0.008        0.360       85.008
 551-604             54         0.007        0.378       85.386
 605-667             63         0.006        0.378       85.764
 668-729             62         0.005        0.310       86.074
 730-798             69         0.004        0.276       86.350
 799-919            121         0.003        0.363       86.713
 920-1087           168         0.002        0.336       87.049
 1088-1271          184         0.001        0.184       87.233
 1272-1296           25        <0.001        0.006       87.239
 Air mail                                    3.200       90.439
 Foreign                                     0.201       90.640
 Residues                                    4.617       95.257
 Miscellaneous                               4.743      100.000

* The standard errors of the estimated percentages, expressed as percents of the
estimates, are between 10 and 15 percent for most of the first 200 destinations.
For the very small percents the standard error may increase to as high as 35
percent.

Several observations, based on the
tabulation, are given here:

1. The largest 200 destinations received 80 percent of the total volume.

2. Seventy-six percent of the total volume remained in the State of
   California (not including air mail).

3. Thirty-nine percent of the total volume remained in San Francisco.

4. Seven destinations (San Francisco, Oakland, Los Angeles, Sacramento,
   Washington State, Berkeley, and New York City) were the only
   destinations to receive more than one percent of the total volume.

5. Eighty percent of the total volume remained on the West Coast (not
   including air mail).
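
Observation 4 and the cumulative column of table 4 can be checked directly from the listed percentages. The sketch below is an illustrative check only. It uses the 25 largest entries printed in table 4, and the variable names are assumptions introduced here.

```python
from itertools import accumulate

# Percent of total volume for the 25 largest destinations, as listed in table 4.
top25_percent = [
    38.501, 8.158, 2.789, 1.364, 1.155, 1.147, 1.116, 0.961, 0.860, 0.775,
    0.759, 0.679, 0.670, 0.654, 0.612, 0.605, 0.574, 0.566, 0.521, 0.504,
    0.396, 0.394, 0.352, 0.349, 0.295,
]

# Running totals, rounded to the three decimals used in the table.
cumulative = [round(total, 3) for total in accumulate(top25_percent)]

# Destinations that individually exceed one percent of the total volume.
above_one_percent = sum(1 for p in top25_percent if p > 1.0)

print("cumulative after rank 25:", cumulative[-1])
print("destinations above one percent:", above_one_percent)
```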

An outstanding feature of the chain-ratio method
of sampling is that emphasis may be placed on
estimating relatively small percentages. Adaptation
of the formulas of section 3.1 shows that the standard
errors of the estimated percentages of mail to the
various destinations considered in table 4, expressed
as percents of the estimates, are between 10 and 15
percent for most of the first 200 destinations. Thus
for Oakland, the estimated relative standard error is
10.4 percent, so the absolute standard error of the
percentage of San Francisco mail having Oakland
for its destination is 0.104 × 8.158 percent = 0.85
percent. The overall uncertainty is therefore of the
order of 3 × 0.85 percent = 2.6 percent, and there is
very little likelihood that Oakland has been misranked in
order of volume.
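
The Oakland arithmetic can be sketched as follows. This is a hedged illustration: the helper name `absolute_standard_error` is an assumption, and the ranking check compares the three-standard-error band only against the point estimates of the neighbouring destinations in table 4, ignoring their own sampling errors.

```python
def absolute_standard_error(estimate_percent, relative_se):
    """Convert a relative standard error (a fraction) into percentage points."""
    return relative_se * estimate_percent


oakland = 8.158            # percent of total volume, rank 2 in table 4
neighbour_above = 38.501   # San Francisco, rank 1
neighbour_below = 2.789    # Los Angeles, rank 3

oakland_se = absolute_standard_error(oakland, 0.104)
oakland_band = 3 * oakland_se  # three-standard-error band

# The rank is treated as stable if the band does not reach either neighbour.
rank_is_stable = (
    oakland - oakland_band > neighbour_below
    and oakland + oakland_band < neighbour_above
)

print(f"standard error: {oakland_se:.2f} percent")
print(f"band: {oakland_band:.2f} percent, rank stable: {rank_is_stable}")
```

The text rounds the standard error to 0.85 before tripling it, which is why it quotes 2.6 percent for the band.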

For the examples in the "tail" of the distribution
cited in section 5.4, the relative standard errors are
somewhat larger. Thus for Bell, Calif., which ranks
about 350, the relative standard error is 21 percent.
Likewise for Albion, Calif., which ranks in the 920
to 1087 group, the relative standard error is 34
percent, or 0.0007 percent on an absolute basis, so
that its overall uncertainty is of the order of ±0.002
percent and its "true" ranking position may be as
high as 730.
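
One reading that reproduces the Albion figures is to apply the relative standard error to the tabulated group value of 0.002 percent, triple it, and look up the rank group whose individual percent equals the upper limit. The sketch below follows that reading. It is an assumption about how the quoted numbers were obtained, and the names `TAIL_GROUPS` and `best_plausible_rank` are illustrative. Percentages are held in units of 0.001 percent to keep the lookup exact.

```python
# Tail groups from table 4: individual percent (in units of 0.001 percent)
# mapped to the rank range of that group.
TAIL_GROUPS = {4: (730, 798), 3: (799, 919), 2: (920, 1087), 1: (1088, 1271)}


def best_plausible_rank(estimate_milli_percent, relative_se, multiplier=3):
    """Return (standard error, rounded band, best rank) in 0.001-percent units."""
    se = relative_se * estimate_milli_percent
    band = round(multiplier * se)
    upper = estimate_milli_percent + band
    if upper not in TAIL_GROUPS:
        raise ValueError("upper limit falls outside the tabulated tail groups")
    first_rank, _last_rank = TAIL_GROUPS[upper]
    return se, band, first_rank


se, band, first_rank = best_plausible_rank(2, 0.34)
print(f"standard error: {se / 1000:.4f} percent")
print(f"band: +/-{band / 1000:.3f} percent, best plausible rank: {first_rank}")
```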

Examination of the complete listings of the San 
Francisco study given here and of the Los Angeles 
and Baltimore studies presented in [8] suggests that 
the proportion of mail to any given destination is 
related to (a) some measure of the "size" of the 
destination, and (b) the distance of the destination 
from the point of origin. Finally, there appears to 
be rather strong evidence that the distribution of 
mail plotted against the ranked destinations is 
rather close to a straight line on log-log paper. 
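
The straight-line behaviour on log-log paper can be examined with an ordinary least-squares fit of log percentage against log rank. The sketch below is illustrative only. It uses just the 25 largest San Francisco destinations printed in table 4 rather than the complete listing, and `fit_log_log` is a hypothetical helper name.

```python
import math


def fit_log_log(percentages):
    """Least-squares fit of log10(percent) = intercept + slope * log10(rank).

    Ranks are taken as 1, 2, 3, ... in the order given.
    Returns (slope, intercept, r_squared).
    """
    xs = [math.log10(rank) for rank in range(1, len(percentages) + 1)]
    ys = [math.log10(p) for p in percentages]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


top25_percent = [
    38.501, 8.158, 2.789, 1.364, 1.155, 1.147, 1.116, 0.961, 0.860, 0.775,
    0.759, 0.679, 0.670, 0.654, 0.612, 0.605, 0.574, 0.566, 0.521, 0.504,
    0.396, 0.394, 0.352, 0.349, 0.295,
]

slope, intercept, r_squared = fit_log_log(top25_percent)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
```

A value of `r_squared` near 1 would indicate that these points lie close to a straight line in log-log coordinates. The same function could be applied to the complete listings for each city.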



Among the many colleagues who assisted in 
various ways toward the completion of this study, 
the authors particularly thank Marvin Zelen of 
NBS for his enthusiastic encouragement and helpful 
suggestions and Inspector John Falconer of the Post 
Office Department for assistance in implementing 
and supervising the collection of the data involved 
in the sampling procedures.